Using Semantically Annotated Corpora to Build Collocation Resources
Abstract
We present an experiment in extracting collocations from the FrameNet corpus, specifically support verbs such as direct in Environmentalists directed strong criticism at world leaders. Support verbs do not contribute meaning of their own; the meaning of the construction is provided by the noun, so recognizing support verbs is useful in text understanding. Having access to a list of support verbs is also useful in applications that can benefit from paraphrasing, such as generation (where paraphrasing can provide variety). This paper starts with a brief presentation of support verbs in Meaning-Text Theory, where they fall under the notion of lexical function, and then discusses how relevant information is encoded in the FrameNet corpus. We describe the resource extracted from the FrameNet corpus.
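As a rough illustration of the kind of resource the abstract describes, the following minimal sketch collects support verbs per noun from annotated sentences. It does not reproduce the authors' procedure or the FrameNet data format: the record layout, the field names `noun` and `support_verb`, and every sentence other than the one quoted in the abstract are assumptions made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-written records standing in for annotated sentences.
# Only the first sentence comes from the abstract; the rest are invented.
# "support_verb" is None when no support verb is marked for the noun.
ANNOTATED = [
    {"text": "Environmentalists directed strong criticism at world leaders.",
     "noun": "criticism", "support_verb": "direct"},
    {"text": "The committee directed criticism at the board.",
     "noun": "criticism", "support_verb": "direct"},
    {"text": "She made a decision.",
     "noun": "decision", "support_verb": "make"},
    {"text": "His criticism was unfair.",
     "noun": "criticism", "support_verb": None},
]


def collect_support_verbs(records):
    """Map each noun to a count of the verbs marked as its support verb."""
    table = defaultdict(Counter)
    for rec in records:
        verb = rec["support_verb"]
        if verb is not None:  # skip sentences without a marked support verb
            table[rec["noun"]][verb] += 1
    return table


table = collect_support_verbs(ANNOTATED)
for noun, verbs in sorted(table.items()):
    print(noun, dict(verbs))
```

A noun-to-verb table of this shape is one plausible form for a collocation list; a paraphrasing component could, for instance, look up `table["criticism"]` to find candidate support verbs for that noun.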
Similar resources
Chinese Learning of Semantical Selectional Preferences Based on LSC Model and Expectation Maximization Algorithm
To address the current shortage of Chinese language resources, this paper proposes an unsupervised method for learning semantic selectional preferences and presents a strategy for obtaining verb-noun semantic collocations in Chinese. The approach to Chinese semantic preference learning is based on the Latent Semantic Clustering model and the Expectation Maximization algorithm. First, the p...
Exploiting parallel texts in the creation of multilingual semantically annotated resources: the MultiSemCor Corpus
In this article we illustrate and evaluate an approach to create high quality linguistically annotated resources based on the exploitation of aligned parallel corpora. This approach is based on the assumption that if a text in one language has been annotated and its translation has not, annotations can be transferred from the source text to the target using word alignment as a bridge. The trans...
The contours of a semantic annotation scheme for Dutch
The creation of semantically annotated corpora has lagged dramatically behind. As a result, the need for such resources has now become urgent. Several initiatives have been launched at the international level in recent years; however, they have focussed almost entirely on English, and not much attention has been dedicated to the creation of semantically annotated Dutch corpora. The Flemish-Dut...
Report on the annotation of semantic roles - TR7
The creation of semantically annotated corpora has lagged dramatically behind. As a result, the need for such resources has now become urgent. Several initiatives have been launched at the international level in recent years; however, they have focussed almost entirely on English, and not much attention has been dedicated to the creation of semantically annotated Dutch corpora. The Flemish-Dut...
WebBANC: Building Semantically-Rich Annotated Corpora from Web User Annotations of Minority Languages
Annotated corpora are sets of structured text used to enable Natural Language Processing (NLP) tasks. Annotations may include tagged parts-of-speech, semantic concepts assigned to phrases, or semantic relationships between these concepts in text. Building annotated corpora is labor-intensive and presents a major obstacle to advancing machine translators, named entity recognizers (NER), part-ofs...
Publication date: 2008